feat: migrate control plane to vercel webhook and consolidate agents on cloud runs#400
Round 2: Vercel control plane wiring
This round adds the control-plane wiring that turns the scaffolding from round 1 into an end-to-end PR-flow runtime. Every PR-triggered webhook now flows through the Vercel control plane (webhook → builder → cloud-agent dispatch → KV → cron → apply-result-to-GitHub).
Commits
What's live
The issue-triggered workflows and the plan-approval workflows are still routed by
Validation
Operator notes
The new
Out of scope
The legacy GitHub Actions paths still pass all tests and continue to handle every workflow during the cutover.
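The last two hops of the flow described in this comment (KV → cron → apply-result-to-GitHub) can be pictured with a small sketch. This is only an illustration under stated assumptions: `InFlightRun`, `drain_once`, the `"succeeded"` status string, and the plain `dict` standing in for Vercel KV are all hypothetical names, not the PR's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InFlightRun:
    """Hypothetical record the webhook would store when it dispatches a cloud run."""
    run_id: str
    workflow: str
    target_number: int  # PR or issue number the result applies to


def drain_once(
    kv: Dict[str, InFlightRun],
    get_status: Callable[[str], str],
    apply_result: Callable[[InFlightRun], None],
) -> List[str]:
    """One cron tick: apply every finished run and drop it from the store.

    `kv` is a dict standing in for the key-value store (assumption).
    Failure and timeout handling are deliberately omitted.
    """
    applied: List[str] = []
    for key, run in list(kv.items()):  # copy so we can delete while iterating
        if get_status(run.run_id) != "succeeded":
            continue  # still in flight; try again on the next tick
        apply_result(run)
        del kv[key]
        applied.append(run.run_id)
    return applied
```

Runs that have not finished stay in the store, so the next tick picks them up again without the webhook having to block on the cloud agent.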
I'm working on changes requested in this PR (responding to a PR conversation comment). You can view the conversation on Warp. I pushed changes to this PR based on the comment. Next steps:
Powered by Oz
Fix
| Suite | Result |
|---|---|
| `python -m pytest control-plane/tests` | 120 passed, 14 subtests passed |
| `PYTHONPATH=.github/scripts python -m unittest discover -s .github/scripts/tests` | 494 passed |
| `python -m compileall control-plane/api control-plane/lib .github/scripts/oz_workflows` | clean |
Pushed
What was failing
Fixes
Vercel deployment: likely needs project-side change
The Vercel deployment fails almost immediately, which usually means Vercel rejected the build before running any commands. The previous The most likely cause is the Vercel project's Root Directory setting (Project Settings → General → Root Directory): if it's still pinned to
Co-Authored-By: Oz <oz-agent@warp.dev>
d2df840 to a3810bc
Rename the shared oz_workflows package to oz and move executable workflow modules from lib/scripts into lib/workflows. Update imports, documentation, specs, and tests to use the new package layout. Co-Authored-By: Oz <oz-agent@warp.dev>
Move the control-plane package from lib to core and update runtime imports, PYTHONPATH configuration, docs, and tests to match the new package layout. Co-Authored-By: Oz <oz-agent@warp.dev>
I'm starting a first review of this pull request. You can follow along in the session on Warp.
I completed the review and posted feedback on this pull request.
Overview
This PR replaces the legacy GitHub Actions/Docker workflow surface with a Vercel webhook and cron control plane, cloud-run workflow adapters, and artifact-based handoff for review, triage, spec, and implementation paths.
Concerns
- The implementation apply path accepts any agent-reported branch that merely starts with the expected target branch, which can select a sibling issue branch.
- The create-spec and create-implementation apply paths lost the previous one-minute timestamp cushion when checking whether the agent pushed changes, so successful cloud runs can be treated as no-ops under normal GitHub/Oz clock skew.
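To make the timestamp-cushion concern concrete, here is a minimal sketch of the comparison. The function and constant names are hypothetical, and it assumes both timestamps are timezone-aware; the PR's real helper may look different.

```python
from datetime import datetime, timedelta, timezone

# One-minute cushion, as described in the review (constant name is hypothetical).
PUSH_CUSHION = timedelta(minutes=1)


def branch_updated_since(
    run_started_at: datetime,
    branch_updated_at: datetime,
    cushion: timedelta = PUSH_CUSHION,
) -> bool:
    """Return True if the branch was updated at or after the run start,
    allowing `cushion` of clock skew between the two timestamp sources."""
    return branch_updated_at >= run_started_at - cushion


# Example: the branch timestamp is 30 seconds "before" the recorded run start.
started = datetime(2026, 4, 30, 12, 0, 30, tzinfo=timezone.utc)
pushed = datetime(2026, 4, 30, 12, 0, 0, tzinfo=timezone.utc)
skew_tolerated = branch_updated_since(started, pushed)
strict_result = branch_updated_since(started, pushed, cushion=timedelta(0))
```

With the cushion the push counts as agent output; with a zero cushion the same run would be treated as a no-op, which is the regression the review describes.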
Security
- The branch override is derived from an agent-produced artifact after the agent has read untrusted issue content; constrain it to the exact target branch or a delimiter-bounded slug before using it to update or open PRs.
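One way to read "exact target branch or a delimiter-bounded slug" is sketched below. It is an assumption-laden illustration: `resolve_branch`, the `-` delimiter, the slug pattern, and the example branch names are all hypothetical, and falling back to the expected branch (rather than raising) is just one possible policy.

```python
import re
from typing import Optional

# Hypothetical slug shape: lowercase alphanumeric words joined by single hyphens.
_SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def resolve_branch(expected: str, reported: Optional[str], delimiter: str = "-") -> str:
    """Accept the agent-reported branch only if it is the expected branch
    or the expected branch plus a delimiter-bounded slug; otherwise fall
    back to the expected branch."""
    if not reported or reported == expected:
        return expected
    prefix = expected + delimiter
    if reported.startswith(prefix) and _SLUG.fullmatch(reported[len(prefix):]):
        return reported
    return expected  # untrusted override rejected
```

A bare `startswith(expected)` check would accept `oz/issue-123` for the expected branch `oz/issue-12`; requiring the delimiter right after the expected name rules that sibling out.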
Verdict
Found: 0 critical, 3 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Constrain implementation branch overrides to the expected branch or a delimiter-bounded suffix and restore the one-minute timestamp cushion when checking agent-pushed spec and implementation branches. Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
…rkflows

Addresses Oz's CHANGES_REQUESTED on this PR: the five workflow files restored in the previous commit reference composite actions (`build-triage-image`, `run-oz-python-script`) and Python entrypoints that were also deleted by PR warpdotdev#400. Without these, the workflows would still fail at action resolution before any script could run.

This commit restores the dependency closure for those five workflows, all from commit 6ffca63 (the parent of the deletion commit 6a5ac7c), scoped to what the legacy issue-triggered helpers actually need:

Composite actions:
- .github/actions/build-triage-image/action.yml
- .github/actions/run-oz-python-script/action.yml

Python entrypoints (one per restored workflow):
- .github/scripts/respond_to_triaged_issue_comment.py
- .github/scripts/comment_on_unready_assigned_issue.py
- .github/scripts/update_dedupe.py
- .github/scripts/update_pr_review.py
- .github/scripts/update_triage.py

Shared library used by the entrypoints:
- .github/scripts/oz_workflows/{__init__,actions,artifacts,docker_agent,env,helpers,oz_client,repo_local,triage,verification,workflow_config,workflow_paths}.py
- .github/scripts/requirements.txt

Triage container (built by `build-triage-image`, run by docker_agent.py):
- docker/triage/{Dockerfile,README.md,entrypoint.sh}
- uv.toml

Intentionally left deleted, because PR warpdotdev#400 explicitly migrated them to the Vercel webhook control plane:
- review-pull-request workflow + review_pr.py + build-review-image action + docker/review/
- enforce-pr-issue-state + enforce_pr_issue_state.py
- respond-to-pr-comment + respond_to_pr_comment.py
- verify-pr-comment + verify_pr_comment.py
- triage-new-issues + triage_new_issues.py
- resolve_review_context.py
- All test files (not strictly needed for runtime)

Verification:
- Grepped all restored Python files for imports; every `oz_workflows.*` module referenced is included in this restore.
- Grepped all restored YAML for `uses: warpdotdev/oz-for-oss/...`; only `build-triage-image` and `run-oz-python-script` are referenced and both are restored.

Refs: warpdotdev#418
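The commit verified the import closure with grep. As a hedged alternative, the same check could be sketched with Python's `ast` module, which ignores matches inside comments and strings. The helper name `imported_submodules` is hypothetical, and `from oz_workflows import x` is treated as a submodule reference even though `x` could be a plain attribute, so the result may over-approximate.

```python
import ast
from typing import Set


def imported_submodules(source: str, package: str = "oz_workflows") -> Set[str]:
    """Collect dotted names under `package` that `source` imports."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == package or alias.name.startswith(package + "."):
                    found.add(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == package:
                # `from oz_workflows import helpers` -> oz_workflows.helpers
                found.update(f"{package}.{alias.name}" for alias in node.names)
            elif node.module.startswith(package + "."):
                found.add(node.module)
    return found


example = (
    "import os\n"
    "from oz_workflows import helpers, env\n"
    "from oz_workflows.triage import run\n"
    "import oz_workflows.actions\n"
)
needed = imported_submodules(example)
```

Running this over each restored entrypoint and comparing the union against the files present under `.github/scripts/oz_workflows/` would flag any module missing from the restore.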
…9843)

## Summary

Removes two GitHub Actions adapter workflows that delegate to reusable workflows in `warpdotdev/oz-for-oss` that no longer exist:

- `.github/workflows/respond-to-triaged-issue-comment-local.yml`
- `.github/workflows/comment-on-unready-assigned-issue-local.yml`

Both have been failing on every trigger since 2026-04-30, when [`warpdotdev/oz-for-oss#400`](warpdotdev/oz-for-oss#400) deleted the upstream targets as part of migrating to a Vercel webhook control plane.

## Context

The [`oz-for-oss` maintainer confirmed](warpdotdev/oz-for-oss#418 (comment)) the upstream deletions were intentional and asked us to remove the warp-side adapters rather than restoring upstream. Per-workflow disposition (her words):

- `respond-to-triaged-issue-comment`: "covered by the issue_comment hook" of the new Vercel webhook.
- `comment-on-unready-assigned-issue`: "currently being rewired" upstream.

Root-cause issue: [`warpdotdev/oz-for-oss#418`](warpdotdev/oz-for-oss#418).

## Verification

Empirical confirmation that the new webhook is live and handling `@oz-agent` mentions on triaged warp issues, with sub-15s response latency from the `oz-for-oss` bot, well below GHA cold-start times:

- Issue #8642 (2026-05-01 15:24 UTC): `@oz-agent` → bot reply 13s later
- Issue #9576 (2026-04-30 23:33 UTC): `@oz-agent` → bot reply 8s later
- Issue #9688 (2026-04-30 23:33 UTC): `@oz-agent` → bot reply 8s later

Meanwhile every fire of `respond-to-triaged-issue-comment-local.yml` since 2026-04-30 has logged `failure` (e.g. run [`25207261218`](https://github.com/warpdotdev/warp/actions/runs/25207261218)); the GHA path is purely noise.

For `comment-on-unready-assigned-issue` there is a brief coverage gap until oz-for-oss finishes rewiring the `issues.assigned` event handler. Failing-and-noisy is worse than absent, so removing now improves signal.

## Scope note

Three additional broken adapters exist for `update-dedupe`, `update-pr-review`, `update-triage` (weekly scheduled skill-refresh workflows). Those are intentionally left in place pending confirmation from `oz-for-oss` on whether the new control plane runs equivalent scheduled jobs against warp, or whether the replacement scheduled agents are still upcoming work. Those three have **never** executed for warp (`gh run list` returns `[]` for each), so leaving them does not affect any current automation.

## Test plan

- [x] No required status checks affected (verified via repo ruleset 15469325); only `Check CI results` is required.
- [x] No PRs blocked: these workflows trigger on `issue_comment` and `issues`, never on `pull_request`.
- [x] Empirical webhook coverage verified for `respond-to-triaged-issue-comment` (see Verification).
- [ ] Reviewer to sanity-check that removing these workflows is acceptable given the brief coverage gap for `comment-on-unready-assigned-issue` flagged above.
Summary
This PR migrates the triage / respond-to-triaged / PR-review workflows off Docker and onto Warp-hosted cloud agent runs, and lays down a Vercel-based control plane that will eventually replace the GitHub Actions delivery surface entirely.
The cloud-mode rewrite is the immediately-active change: each of the three Python entrypoints now calls `run_agent` (cloud) + `build_agent_config(role="review-triage")` and reads results through the new `oz_workflows.artifacts.load_*_artifact` helpers. Their prompts now describe an `oz artifact upload <name>.json` handoff instead of a `/mnt/output` mount; security rules, output schemas, and skill references survive the rewrite verbatim. The Docker assets (`docker/triage/`, `docker/review/`, `build-{triage,review}-image` composite actions, `docker_agent.py`, `test_docker_agent.py`) are deleted, and the three workflow YAMLs that used them are updated to drop the `Build … agent container` step and forward `WARP_ENVIRONMENT_ID` + `WARP_REVIEW_TRIAGE_ENVIRONMENT_ID` to the script step. The workflow YAMLs themselves are intentionally retained so the existing GitHub Actions delivery path keeps working through the cutover.

The new `control-plane/` Python project is the long-term target: a Vercel webhook handler that verifies HMAC-SHA256 signatures and routes events to a workflow handler, plus a 1-minute cron poller that reads in-flight run state from Vercel KV and applies completed cloud-agent results back to GitHub. Lib helpers cover signatures, routing, trust evaluation, dispatch, in-flight state, GitHub App token exchange, and the cron drain loop. The full architecture and deployment runbook live in `control-plane/README.md`.

Workflows served by the Vercel webhook in this PR
The webhook + cron control plane now owns the following delivery surface, and the corresponding `.github/workflows/*.yml` + `.github/scripts/*.py` shims have been removed in favor of the cloud-mode helpers under `lib/scripts/`:

- `review-pull-request` (PR opened / ready_for_review / review_requested / labeled / `/oz-review`).
- `enforce-pr-issue-state` (PR synchronize / edited).
- `respond-to-pr-comment` (`@oz-agent` mention on PRs and review threads).
- `verify-pr-comment` (`/oz-verify` on PRs).
- `triage-new-issues` (newly added): `issues.opened` on non-`triaged` issues, `@oz-agent` mention on a non-`triaged` issue, and `needs-info` reporter replies all route through the webhook.

`@oz-agent` mentions on already-`triaged` issues continue to flow through the legacy `respond-to-triaged-issue-comment` GitHub Actions workflow until that workflow is migrated in a follow-up.

Cutover steps after merge
- `control-plane/.vercel.json` declares the runtime, both functions, and the 1-minute cron schedule.
- `OZ_GITHUB_WEBHOOK_SECRET`, `OZ_GITHUB_APP_ID`, `OZ_GITHUB_APP_PRIVATE_KEY`, `WARP_API_KEY`, `WARP_API_BASE_URL`, `WARP_ENVIRONMENT_ID`, `WARP_REVIEW_TRIAGE_ENVIRONMENT_ID`, `CRON_SECRET`. Detail per-secret in `control-plane/README.md`.
- `vercel_kv` lazily.
- `https://<project>.vercel.app/api/webhook`. The webhook handler returns 202 with the routed workflow id so the Recent Deliveries UI stays green.
- `.github/workflows/*` per plan §5a: once the Vercel control plane is verified end-to-end, delete the remaining legacy GitHub Actions YAMLs in a follow-up PR. This PR keeps the issue-triggered helpers (`respond-to-triaged-issue-comment`, `create-spec-from-issue`, `create-implementation-from-issue`, `comment-on-*`) and the plan-approval workflows in place so both delivery paths can be exercised in parallel during cutover.

Validation
- `python -m pytest tests` → 191 tests passed, 47 subtests passed (signature verification, routing table, trust evaluation, dispatch, cron drain loop, builder lifecycle, handler wiring, triage prompt + apply helpers).
- `PYTHONPATH=lib:.github/scripts python -m unittest discover -s .github/scripts/tests` → 294 OK (helpers, role parameter, named artifact loaders, cloud-mode triage/review/respond-to-triaged dispatch, and skill-section assertions; the triage-specific tests have moved to `tests/test_triage.py`).
- `Security Rules:` blocks were diff-checked byte-for-byte against `main` to confirm the cloud rewrite did not weaken or relax the prompt-injection / output-schema rules.

References
Plan id: `7e8e8b6a-9e8a-4cbf-ab95-dd37cf4cc44c`.
Conversation: https://staging.warp.dev/conversation/da08a6c7-4f86-4dac-99f6-3358cfe3258e
Run: https://oz.staging.warp.dev/runs/019dd6d9-cb65-73a2-8cfb-d03d37afd03a
Plans:
This PR was generated with Oz.
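The summary above says the webhook handler verifies HMAC-SHA256 signatures before routing. A minimal sketch of that check follows, assuming the signature arrives in GitHub's `X-Hub-Signature-256` header as `sha256=<hex digest>` and that the secret is the value of `OZ_GITHUB_WEBHOOK_SECRET`. The function name `verify_signature` is hypothetical and this is not the PR's implementation.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(secret: str, body: bytes, header: Optional[str]) -> bool:
    """Check a webhook body against a `sha256=<hex>` signature header.

    `body` must be the raw request bytes, not a re-serialized JSON object,
    because any byte difference changes the digest.
    """
    if not header or not header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Compare as bytes in constant time to avoid leaking how many characters match.
    return hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8"))
```

A handler would typically reject the request (for example with a 401) when this returns `False`, and only then look at the event type to pick a workflow.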